Abstract: On social media in Kenya, Nairobi Swahili is the norm for communication in institutions of higher learning. Extant studies dwell on Standard Swahili, leaving limited text classification literature for Nairobi Swahili natural language processing. This research explores how social media provides new ways of interacting, resulting in new challenges in managing student concerns that now require new knowledge for decision-making. Students have taken advantage of social media platforms by creating virtual discussion forums, which are quickly becoming repositories of collective knowledge. Unfortunately, institutions of higher learning are not able to utilize the knowledge collected through these platforms. The focus of this research is to make the knowledge generated via social media useful, applying opinion mining to enable its extraction, classification and storage in support of decision-making. Different algorithms were tested using data from popular social media forums operated by students in Kenyan universities. The results showed that the Support Vector Machine (SVM) gives the best results when used with a linear kernel, and performs better on term frequency-inverse document frequency (TF-IDF) features with N-grams. An analysis of the different SVM kernels showed the linear kernel to have the best performance at 80%, compared with the polynomial and Radial Basis Function kernels, which both stand at 57%. To choose the best feature selection method for use with the linear SVM, term frequency (TF) and TF-IDF were tested. TF-IDF with N-grams performed better, at 83%, which gives this research both theoretical and practical significance. The research would provide first-hand information for decision support in Kenyan higher learning institutions using text-mining tools on social media.
Keywords: Nairobi Swahili, Support Vector Machines, Feature Selection, N-grams
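
To make the two feature selection methods compared in the abstract concrete, the following is a minimal sketch of TF and TF-IDF weighting over word N-grams, written in plain Python. It is an illustration under stated assumptions, not the study's implementation: the function names, the choice of unigrams plus bigrams, the idf = log(N / df) variant, and the three example posts are all hypothetical. The classification step with a linear SVM is not shown.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tf_vectors(docs, n_max=2):
    """Term-frequency (TF) vectors: raw n-gram counts per document."""
    return [Counter(ngrams(doc.lower().split(), n_max)) for doc in docs]


def tfidf_vectors(docs, n_max=2):
    """TF-IDF vectors using the plain idf = log(N / df) weighting (an assumed variant)."""
    tf = tf_vectors(docs, n_max)
    n_docs = len(docs)
    # Document frequency: number of documents in which each n-gram occurs.
    df = Counter(term for vec in tf for term in vec)
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in vec.items()}
        for vec in tf
    ]


# Invented placeholder posts, not taken from the study's dataset.
posts = [
    "portal iko down",
    "exam results ziko portal",
    "hostel maji hakuna",
]

tf = tf_vectors(posts)
tfidf = tfidf_vectors(posts)
```

Under this weighting, an N-gram that appears in many posts (here "portal") receives a lower weight than one confined to a single post (here "hostel"), whereas plain TF treats both counts equally. The resulting vectors are the kind of input a linear SVM classifier would then be trained on.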